41 research outputs found

    Using parallel corpora for translation-oriented term extraction

    In many scientific, technological or political fields, terminology and the production of up-to-date reference works are lagging behind, which causes problems for translators and results in inconsistent translations. Parallel corpora of already translated texts can be used as a resource for the automatic extraction of terms and terminological collocations. The paper describes how a methodology for multi-word term extraction and bilingual conceptual mapping was developed for Slovene-English terms. We used word-to-word alignment to extract a bilingual glossary of single-word terms; for multi-word terms, two methods were tested and compared. The statistical method is broadly applicable but gives results of very limited use, while the syntactic-pattern method extracts highly useful terminological phrases, but only from a tagged corpus. The paper closes with an outlook on further development and on how these methods might be incorporated into existing translation tools.
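
    The abstract does not say which alignment technique was used for the single-word glossary. As a rough illustration only, the sketch below pairs each source word with the target word it co-occurs with most consistently across sentence-aligned pairs, scored with the Dice coefficient. The function name `dice_glossary`, the `min_score` threshold and the toy Slovene-English sentences are all assumptions made for this example, not details from the paper.

```python
from collections import Counter
from itertools import product


def dice_glossary(pairs, min_score=0.5):
    """Build a toy bilingual glossary from sentence-aligned token lists.

    pairs: iterable of (source_tokens, target_tokens).
    Returns {source_word: (best_target_word, dice_score)}.
    Dice scoring is an assumption; the paper's own method is not specified.
    """
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_set, tgt_set = set(src), set(tgt)
        src_freq.update(src_set)   # sentences containing each source word
        tgt_freq.update(tgt_set)   # sentences containing each target word
        joint.update(product(src_set, tgt_set))  # sentence-level co-occurrence

    glossary = {}
    for (s, t), n in joint.items():
        score = 2 * n / (src_freq[s] + tgt_freq[t])
        # Keep only the highest-scoring target word for each source word.
        if score >= min_score and score > glossary.get(s, ("", 0.0))[1]:
            glossary[s] = (t, score)
    return glossary


# Hypothetical sentence pairs, already tokenised and lower-cased.
toy_pairs = [
    (["zakon", "velja"], ["law", "applies"]),
    (["zakon", "ureja"], ["law", "regulates"]),
    (["uredba", "velja"], ["regulation", "applies"]),
]

for source_word, (target_word, score) in sorted(dice_glossary(toy_pairs).items()):
    print(f"{source_word} -> {target_word} ({score:.2f})")
```

    A real pipeline would lemmatise both languages first, since Slovene inflection spreads one term over many surface forms, and would need far more data than three sentences for the scores to mean anything.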

    Language in the age of dataism

    The digital age brings dramatic changes to language and communication; its effects can be seen in the ways we use language, the channels we use to communicate and the manner in which ideas are spread. At the other end of the spectrum, our linguistic behaviour, communications and knowledge are transformed into data which can be used or bought to feed intelligent technologies. The article presents a bird's eye view of these dynamics of change, first by focusing on the impact of digitisation on language itself, then by analysing current trends in the language industry, where traditional services are being replaced by technology- and data-driven solutions, and finally by exploring the impact of these technologies on man and society at large. We make a case for digital linguistics as an interdisciplinary field of study which adopts a human-centred approach to the sociolinguistic, technological, economic, infrastructural and ethical issues emerging with regard to language in the digital age.

    Public relations terminology: corpus, extraction, terminological database [Terminologija odnosov z javnostmi: korpus — luščenje — terminološka podatkovna zbirka]

    This paper presents an analysis of the extraction of single- and multi-word term candidates, carried out with the LUIZ term extractor on the KoRP corpus in order to prepare a terminological database for public relations. We focus on two things: (a) the extracted single-word noun term candidates, whose list we compare with the frequency list of nouns in the KoRP corpus and evaluate for termhood as judged by two domain experts, and (b) the extracted multi-word strings with a verbal or nominal head. We complemented the upgraded extraction method and the improved presentation of results with a recall analysis. We confirmed or found that, compared with the frequency list, the units in the upper part of the list of extracted nouns have greater terminological potential; that the extracted verbal strings have mainly collocational rather than terminological value; and that the most terminologically productive extraction patterns for noun phrases have the following structures: [adjective + noun], [adjective + and + adjective + noun] and [adjective + adjective + noun]. The recall analysis showed above all a low level of agreement between the two domain experts, although recall itself was relatively high.
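
    The three noun-phrase patterns named in the abstract (adjective + noun, adjective + "in" + adjective + noun, where "in" is Slovene for "and", and adjective + adjective + noun) can be illustrated with a small matcher over part-of-speech-tagged tokens. This is only a sketch of pattern-based extraction in general, not the LUIZ implementation: the tag names `ADJ` and `NOUN`, the function names and the example sentence are assumptions made for this illustration.

```python
# Each pattern is a sequence of slots; a slot matches either a POS tag
# or a literal word form. Tag names are assumed, not taken from LUIZ.
PATTERNS = [
    (("tag", "ADJ"), ("tag", "NOUN")),
    (("tag", "ADJ"), ("word", "in"), ("tag", "ADJ"), ("tag", "NOUN")),  # "in" = "and"
    (("tag", "ADJ"), ("tag", "ADJ"), ("tag", "NOUN")),
]


def slot_matches(token, slot):
    """Check one (word, tag) token against one (kind, value) slot."""
    word, tag = token
    kind, value = slot
    return tag == value if kind == "tag" else word.lower() == value


def extract_candidates(tagged, patterns=PATTERNS):
    """Return every token span in `tagged` that matches one of the patterns."""
    candidates = []
    for start in range(len(tagged)):
        for pattern in patterns:
            window = tagged[start:start + len(pattern)]
            if len(window) == len(pattern) and all(
                slot_matches(token, slot) for token, slot in zip(window, pattern)
            ):
                candidates.append(" ".join(word for word, _ in window))
    return candidates


# Hypothetical tagged sentence: "The company uses internal and external communication."
sentence = [
    ("Podjetje", "NOUN"), ("uporablja", "VERB"), ("interno", "ADJ"),
    ("in", "CCONJ"), ("eksterno", "ADJ"), ("komuniciranje", "NOUN"),
]

print(extract_candidates(sentence))
```

    The matcher reports nested spans separately, so the shorter "eksterno komuniciranje" appears alongside the full coordinated phrase. Whether to keep, merge or rank such overlapping candidates is a design decision the abstract does not address.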

    Classificatory Role of Adjectives in Karstology

    Based on a monolingual Croatian corpus, we analysed the combinatory potential of key terms to determine the conceptual characteristics relevant for categorizing karst phenomena. The corpus results point to an important role of adjectives in defining geomorphological concepts and classifying them within a taxonomy. The proposed model of the organization of specialized concepts and their modifiers seeks to predict the relationship between nominal bases and adjectival modifiers, on the assumption that the set of conceptual relations is conditioned by the semantic category of the concept, i.e. of the nominal base. The analysis confirms the assumption that attributes are a necessary element of the definitional patterns of karst phenomena. In addition, the results illustrate the link between the combinatory potential of the members of a given class and the attribute values by which those members differ from one another.

    Annotation, exploitation and evaluation of parallel corpora: TC3 I

    Exchange between the translation studies and the computational linguistics communities has traditionally not been very intense. Among other things, this is reflected in their different views on parallel corpora. While computational linguistics does not always pay strict attention to the translation direction (e.g. when translation rules are extracted from (sub)corpora which actually consist only of translations), translation studies is concerned, among other things, with the exact comparison of source and target texts (e.g. to draw conclusions on interference and standardization effects). However, there has recently been more exchange between the two fields – especially when it comes to the annotation of parallel corpora. This special issue brings together the different research perspectives. Its contributions show – from both perspectives – how the communities have come to interact in recent years.
